# Interpretable Diffusion Via Information Decomposition
This paper illuminates the fine-grained relationships learned by diffusion
models by noticing a precise relationship between diffusion and information
decomposition. We exploit these new relations to measure the compositional understanding of diffusion
models, to do unsupervised localization of objects in images, and to measure
effects when selectively editing images through prompt interventions.

$$
\mathfrak{i}^o(x;y) \equiv \frac{1}{2} \int \mathbb E_{p(\epsilon)} \left[|| \hat\epsilon_\alpha(x_\alpha) - \hat\epsilon_\alpha(x_\alpha | y) ||^2 \right] d\alpha
$$
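
The estimator above can be sketched numerically on a toy problem. The following is a minimal illustration, not the code used in this repository. It assumes that $\alpha$ is a log signal-to-noise ratio and that $x_\alpha = \sqrt{\sigma(\alpha)}\,x + \sqrt{\sigma(-\alpha)}\,\epsilon$ with $\sigma$ the sigmoid (a variance-preserving noising scheme). The function `pointwise_info`, the analytic Gaussian denoisers `eps_uncond` and `eps_cond`, and the integration grid are all hypothetical names and choices introduced for this example. For jointly Gaussian scalars with correlation `rho`, the average of the pointwise estimate should land near the mutual information $-\tfrac{1}{2}\log(1-\rho^2)$.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def pointwise_info(x, eps_uncond, eps_cond, alphas, n_noise=8, rng=None):
    """Monte Carlo estimate of i^o(x; y) for a batch x of shape (batch, dim).

    eps_uncond(x_alpha, alpha) and eps_cond(x_alpha, alpha) are hypothetical
    callables that return noise predictions shaped like x_alpha.
    """
    rng = np.random.default_rng() if rng is None else rng
    integrand = np.zeros((len(alphas), x.shape[0]))
    for i, alpha in enumerate(alphas):
        a, b = np.sqrt(sigmoid(alpha)), np.sqrt(sigmoid(-alpha))
        eps = rng.standard_normal((n_noise,) + x.shape)
        x_alpha = a * x + b * eps  # assumed noising at log-SNR alpha
        diff = eps_uncond(x_alpha, alpha) - eps_cond(x_alpha, alpha)
        # Squared norm over dim, then average over noise draws.
        integrand[i] = (diff ** 2).sum(axis=-1).mean(axis=0)
    # Trapezoidal rule over the alpha grid, then the 1/2 prefactor.
    widths = np.diff(alphas)[:, None]
    return 0.5 * np.sum(0.5 * (integrand[1:] + integrand[:-1]) * widths, axis=0)

# Toy data: x and y are standard normal scalars with correlation rho.
rho = 0.8
rng = np.random.default_rng(0)
y = rng.standard_normal((2000, 1))
x = rho * y + np.sqrt(1 - rho ** 2) * rng.standard_normal((2000, 1))

def eps_uncond(x_alpha, alpha):
    # Optimal noise prediction when x ~ N(0, 1).
    return np.sqrt(sigmoid(-alpha)) * x_alpha

def eps_cond(x_alpha, alpha):
    # Optimal noise prediction when x | y ~ N(rho * y, 1 - rho^2).
    a, b = np.sqrt(sigmoid(alpha)), np.sqrt(sigmoid(-alpha))
    return b * (x_alpha - a * rho * y) / (1 - a ** 2 * rho ** 2)

alphas = np.linspace(-12.0, 12.0, 97)  # truncated log-SNR range (an assumption)
info = pointwise_info(x, eps_uncond, eps_cond, alphas, rng=rng)
true_mi = -0.5 * np.log(1 - rho ** 2)
print(f"mean pointwise estimate: {info.mean():.3f}, analytic MI: {true_mi:.3f}")
```

With a trained diffusion model, the two callables would be replaced by the unconditional and prompt-conditioned noise predictions of the network; how the repository samples and weights $\alpha$ is controlled by the script options and is not reproduced here.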


## Environment Setup
Install dependencies by running:
```install dependencies
conda create --name infodp python=3.10
conda activate infodp
conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
conda install -c "nvidia/label/cuda-11.8.0" cuda-toolkit
cd Info_Decomp_Sample
pip install -e .
```

## Dataset Setup
Download all datasets by running:
```download datasets
mkdir ./datasets # make sure the repo has this directory and the corresponding JSON file.
wget http://images.cocodataset.org/zips/val2017.zip -O coco_val2017.zip
wget http://images.cocodataset.org/annotations/annotations_trainval2017.zip -O coco_ann2017.zip
unzip coco_val2017.zip -d coco/
rm coco_val2017.zip
unzip coco_ann2017.zip -d coco/
rm coco_ann2017.zip
```
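
A quick sanity check of the unzipped layout can save a failed evaluation run. The sketch below is an optional helper, not part of the repository: `missing_coco_paths` and `EXPECTED` are hypothetical names, and the listed paths are simply those the two COCO 2017 archives normally unpack to. Which annotation files the evaluation script actually reads is not stated here, so treat the list as an assumption and adjust it as needed.

```python
from pathlib import Path

# Paths the two archives normally produce, relative to the COCO root (assumed list).
EXPECTED = (
    "val2017",
    "annotations/instances_val2017.json",
    "annotations/captions_val2017.json",
)

def missing_coco_paths(root):
    """Return the expected relative paths that do not exist under root."""
    root = Path(root)
    return [rel for rel in EXPECTED if not (root / rel).exists()]

if __name__ == "__main__":
    # Pass whichever directory you unzipped into, e.g. "coco".
    missing = missing_coco_paths("coco")
    print("COCO layout looks complete" if not missing else f"Missing: {missing}")
```

Note that the commands above unzip into `coco/` relative to the current directory, while the evaluation command points `--data_in_dir` at `./datasets/coco`, so check the directory you intend to pass to the script.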

## Evaluation
After downloading the datasets, set `./Info_Decomp` as the working directory, then run the commands below.

### Unsupervised Word Localization
1. Evaluate ITDM (`nll_2D` example; see more options with `python ./scripts/eval_itd.py --help`):
```eval itd
python ./scripts/eval_itd.py --res_out_dir './results/itd' --data_in_dir './datasets/coco' \
--sdm_version 'sdm_2_1_base' --n_samples_per_point 1 --num_steps 100 --batch_size 10 \
--logsnr_loc 1.0 --logsnr_scale 2.0 --clip 3.0 \
--upscale_mode 'bilinear' --int_mode 'logistic' \
--csv_name 'COCO-IT' --res_type 'nll_2D' --eval_metrics 'cmi' --seed 42
```

## Pre-trained Models
We use the pre-trained [Stable Diffusion v2-1-base](https://huggingface.co/stabilityai/stable-diffusion-2-1-base) model, available on Hugging Face.


## References
- HuggingFace's [diffusers](https://github.com/huggingface/diffusers) library.
- Raphael Tang's [DAAM](https://github.com/castorini/daam) GitHub repository.
- Nan Liu's [Unsupervised-Compositional-Concepts-Discovery](https://github.com/nanlliu/Unsupervised-Compositional-Concepts-Discovery) GitHub repository (especially, daam_ddim_visualize.py).
- Mert Yuksekgonul's [vision-language-models-are-bows](https://github.com/mertyg/vision-language-models-are-bows) GitHub repository.